Training complex neural architectures has brought about a small revolution in the machine learning world. In IDA, we extend neural networks with capabilities known from symbolic AI, such as (logical) reasoning and learning from complex, interconnected (relational) data. We also propose advanced architectures for gene expression inference that use hardware efficiently, achieving better performance than state-of-the-art architectures under the same hardware constraints. We leverage deep learning in our application domains, such as sports analytics and bioinformatics.
Building on well-grounded theoretical principles, we formulate algorithms for challenging optimization problems arising in a variety of settings. In particular, we develop continuous constrained optimization algorithms as used in engineering design, online path-following procedures for process control, and stochastic (random) optimization methods, with a particular focus on machine learning. This includes developing methods superior to standard stochastic gradient descent in particular settings, using parallel hardware to speed up training in wall-clock time, distributed and federated learning, and posterior sampling.
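To make the baseline concrete, the following is a minimal sketch of plain stochastic gradient descent on a toy problem (minimizing the mean squared distance to a set of points, whose optimum is the sample mean). This is a generic illustration, not one of our methods; the function names are ours.

```python
import random

def sgd(grad_sample, w0, data, lr=0.1, epochs=50, seed=0):
    """Plain SGD: one per-sample gradient step at a time."""
    rng = random.Random(seed)
    w = w0
    for _ in range(epochs):
        rng.shuffle(data)          # fresh sample order each epoch
        for x in data:
            w -= lr * grad_sample(w, x)
    return w

# Toy objective: mean of (w - x)^2 over the data; the per-sample
# gradient is 2 * (w - x), and the minimizer is the sample mean.
data = [1.0, 2.0, 3.0, 4.0]
w = sgd(lambda w, x: 2 * (w - x), 0.0, data)
```

With a constant learning rate the iterate keeps fluctuating around the optimum (here 2.5) instead of converging exactly; improving on this behavior in specific settings is one motivation for the methods mentioned above.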
We develop algorithms for calculating the similarity of partially assembled data. Classical applications of these algorithms include phylogeny and other clustering techniques; however, the measure also transfers to other machine learning tasks, such as classification. Beyond that, we are interested in applying relational and logic-based learning algorithms to biological estimation and prediction problems with hybrid (real-valued and discrete) and structured data and expressive background knowledge.
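As a toy illustration of such a similarity measure (an assumption for illustration only, not our actual method), one can compare partially assembled sequences by the Jaccard distance of their k-mer sets, which tolerates missing fragments:

```python
def kmers(seq, k=3):
    """Set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def jaccard_distance(a, b, k=3):
    """1 - |A & B| / |A | B| over k-mer sets; 0.0 means identical content."""
    ka, kb = kmers(a, k), kmers(b, k)
    return 1.0 - len(ka & kb) / len(ka | kb)

# Pairwise distance matrix over (possibly partial) toy sequences;
# such a matrix is the usual input to phylogeny and clustering methods.
seqs = {"s1": "ACGTACGT", "s2": "ACGTACGA", "s3": "TTTTGGGG"}
dist = {(i, j): jaccard_distance(seqs[i], seqs[j])
        for i in seqs for j in seqs if i < j}
```

The resulting distance matrix can feed hierarchical clustering for phylogeny, or serve as a kernel-like input to classifiers, which is the transferability mentioned above.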
In the application domain of predictive sports analytics, we benefit from our development of machine learning models and their interconnection with mathematical optimization tasks. Our core testbed application is exploiting betting markets with a proper combination of the two techniques, such as in the project of End-to-end learning of optimal portfolios. We further focus on creating novel predictive models for match outcomes across different sports domains (football, basketball, tennis), input statistics (scores, features, ratings), levels of granularity (individual, team), and game mechanics (standard, e-sports, fantasy sports).
We develop methods and software tools for understanding molecular biology data, specializing in machine learning and statistical methods. One of our main goals is to develop interpretable and accurate models from datasets where the number of samples is much smaller than the number of features. A key challenge here is overfitting, which we mitigate with the aid of prior knowledge available in bioinformatics databases (gene annotations, interaction databases, metabolic and signaling pathways, etc.).
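One simple way prior knowledge can curb overfitting is to collapse thousands of per-gene features into a handful of pathway-level scores before model fitting. The sketch below assumes a toy annotation map (real groupings would come from databases such as KEGG or GO); names and data are hypothetical.

```python
from statistics import mean

# Hypothetical pathway annotations: pathway -> member genes.
pathways = {
    "pathA": ["g1", "g2"],
    "pathB": ["g3", "g4", "g5"],
}

def pathway_scores(sample, pathways):
    """Collapse per-gene expression into per-pathway mean scores,
    shrinking the feature space from genes to pathways."""
    return {p: mean(sample[g] for g in genes if g in sample)
            for p, genes in pathways.items()}

# One sample with five gene-expression values -> two pathway features.
sample = {"g1": 2.0, "g2": 4.0, "g3": 1.0, "g4": 1.0, "g5": 4.0}
scores = pathway_scores(sample, pathways)
```

A downstream model then fits far fewer parameters, and the learned coefficients are interpretable at the pathway level.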
We develop and apply statistical methods to various biological problems. We specialize in omics data analysis and fusion; our typical goal is to interpret these data in terms of simplified and understandable statistical models. A simple example is differential gene expression analysis followed by enrichment analysis.
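That two-step example can be sketched in a few lines: score each gene with a Welch t statistic between conditions, then ask whether the significant genes are over-represented in an annotated set via a hypergeometric tail probability. The data below are a toy illustration, not real measurements.

```python
from statistics import mean, stdev
from math import comb, sqrt

def welch_t(xs, ys):
    """Welch's t statistic for two independent samples."""
    vx, vy = stdev(xs) ** 2, stdev(ys) ** 2
    return (mean(xs) - mean(ys)) / sqrt(vx / len(xs) + vy / len(ys))

def enrichment_p(k, K, n, N):
    """Hypergeometric tail P(X >= k): chance that, of n selected genes
    out of N total, at least k fall in an annotated set of size K."""
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1)) / comb(N, n)

# Step 1: differential expression on toy data (three replicates each).
control = {"g1": [1.0, 1.1, 0.9], "g2": [2.0, 2.1, 1.9]}
treated = {"g1": [3.0, 3.2, 2.8], "g2": [2.0, 1.9, 2.1]}
hits = [g for g in control if abs(welch_t(treated[g], control[g])) > 3]

# Step 2: toy enrichment: 3 of 10 genes annotated, 3 selected, 2 annotated.
p = enrichment_p(2, 3, 3, 10)
```

In practice one would also correct for multiple testing across genes; the sketch omits that step for brevity.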
Learning theory studies why machine learning works. We study learning theory in the relational learning setting where many standard assumptions, such as the “i.i.d. assumption”, do not hold. This makes the studied problems more challenging but also more interesting. We also study how natural aspects of learning such as generalization of facts into rules in the presence of background knowledge can be modeled in formal logic frameworks. We contribute to the fields of inductive logic programming, statistical relational learning and related areas.
We develop algorithms that can learn from relational data (databases, graphs, networks, ontologies) accounting for both the structural and probabilistic aspects of the data. We also study theoretical aspects of relational learning problems such as their computational complexity and sample complexity. We contribute to the fields of inductive logic programming, statistical relational learning and related areas.
We study machine-learning algorithms that learn hypotheses and models represented in human-readable languages such as first-order logic. We also try to endow non-symbolic frameworks such as neural networks with symbolic elements to improve the interpretability of the learned models. We contribute to the fields of inductive logic programming, statistical relational learning and related areas.
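To illustrate what a human-readable hypothesis looks like, here is a classic first-order rule, grandparent(X, Z) :- parent(X, Y), parent(Y, Z), evaluated over toy relational facts by a self-join. The facts are hypothetical and the evaluation is a sketch, not our learning algorithm.

```python
# Toy relational facts: parent(X, Y) as a set of pairs.
parent = {("ann", "bob"), ("bob", "cia")}

def grandparent(facts):
    """Evaluate grandparent(X, Z) :- parent(X, Y), parent(Y, Z)
    by joining the parent relation with itself on the shared variable Y."""
    return {(x, z) for (x, y) in facts for (y2, z) in facts if y == y2}

gp = grandparent(parent)
```

The appeal of such hypotheses is that the learned rule itself is the explanation: a domain expert can read and verify it directly, unlike the weights of a neural network.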